The analysis of complex networks permeates all sciences, from biology to sociology. A fundamental,
unsolved problem is how to characterize the community structure of a network. Here, using
both standard and novel benchmarks, we show that maximization of a simple global parameter, 
which we call Surprise (S), leads to a very efficient characterization of the community structure 
of complex synthetic networks. Particularly, S qualitatively outperforms the most commonly used 
criterion to define communities, Newman and Girvan's modularity (Q). Applying S maximization to 
real networks often provides natural, well-supported partitions, but also sometimes counterintuitive 
solutions that expose the limitations of our previous knowledge. These results indicate that it is 
possible to define an effective global criterion for community structure and open new routes for the 
understanding of complex networks. 
Introduction



A network of interacting units is often the best abstract 
representation of real-life situations or experimental data. 
This has led to a growing interest in developing methods 
for network analysis in scientific fields as diverse as math- 
ematics, physics, sociology and, most especially, biology, 
both to study organismic (e. g. populational, ecological) 
and cellular (metabolic, genomic) networks [THS]. A sig- 
nificant step to understand the properties of a network 
consists in determining its communities, compact clus- 
ters of densely linked, related units. However, the best 
way to establish the community structure of a network is 
still disputed. Many strategies have been used (reviewed
in [5]), the most popular being the maximization of Newman
and Girvan's modularity (Q) [7]. However, Q has
the drawback of being affected by a resolution limit: its
maximization fails to detect communities smaller than
a threshold size that depends on the total size of the
network and the pattern of connections [8]. Since this
finding, no other global parameters have been proposed 
to substitute Q. Alternative strategies (searching for lo- 
cal structural determinants, multilevel optimization of Q) 
have been suggested, but none of them has achieved gen- 
eral acceptance 0. 



Some years ago, we suggested determining the community
structure of a network by evaluating the distributions
of intra- and inter-community links with a cumulative
hypergeometric distribution [9]. Accordingly,
finding the optimal community structure of a network
of symmetrically connected units (undirected graph) is
equivalent to maximizing the following parameter:

\[ S = -\log \sum_{j=p}^{\min(M,n)} \frac{\binom{M}{j}\binom{F-M}{n-j}}{\binom{F}{n}} \tag{1} \]

where F is the maximum possible number of links in a
network (i.e. (k^2 - k)/2, k being the number of units), n
is the observed number of links, M is the maximum possible
number of intracommunity links for a given partition,
and p is the total number of intracommunity links actually
observed in that partition. The parameter S, which
stands for Surprise, measures the "surprise" (improbability)
of finding by chance a partition with the observed
enrichment of intracommunity links in a random
graph.

In this work, we show that S has features that make it
the parameter of choice for global estimation of community
structure. By using standard and novel benchmarks
and a set of high-quality algorithms for community detection,
we show that maximizing S often provides optimal
characterizations of the existing communities. When this
method was applied to real networks, we obtained some expected,
logical solutions (some of them much better than
those provided by Q maximization) but also unexpected
partitions that demonstrate the limitations that the use
of inefficient tools has hitherto imposed on the field.

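As an illustration of formula (1), the following minimal sketch evaluates Surprise for a given partition of an undirected graph. It is not the authors' implementation: the function names `surprise` and `partition_counts` are hypothetical, the logarithm is assumed to be base 10, and the sum is evaluated with exact integers, which is only practical for small and medium networks.

```python
from math import comb, log10

def surprise(k, n, M, p):
    """Surprise of a partition, following formula (1).

    k: number of units, n: observed links,
    M: maximum possible intracommunity links,
    p: observed intracommunity links.
    Assumes a base-10 logarithm and 0 <= p <= min(M, n).
    """
    F = k * (k - 1) // 2  # maximum possible number of links
    upper = min(M, n)
    # Exact upper tail of the hypergeometric distribution (integer arithmetic).
    tail = sum(comb(M, j) * comb(F - M, n - j) for j in range(p, upper + 1))
    # -log10(tail / C(F, n)), computed on integers to avoid underflow.
    return log10(comb(F, n)) - log10(tail)

def partition_counts(edges, membership):
    """Derive (k, n, M, p) from an edge list and a node -> community map."""
    sizes = {}
    for community in membership.values():
        sizes[community] = sizes.get(community, 0) + 1
    M = sum(s * (s - 1) // 2 for s in sizes.values())
    p = sum(1 for u, v in edges if membership[u] == membership[v])
    return len(membership), len(edges), M, p

# Two triangles joined by a single link, split into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(surprise(*partition_counts(edges, membership)))
```

For this toy graph, k = 6, n = 7, M = 6 and p = 6, so the sum has a single term and S = log10(715), about 2.85. Placing all units in one community gives S = 0, since every link is then trivially intracommunity.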
Results

Testing the performance of a global parameter to de- 
termine community structure requires both a set of ef- 
ficient algorithms for community detection and a set of 
standard benchmarks, consisting in synthetic networks of 
known structure. In this study, six selected algorithms 
(see Methods) were tested in two types of benchmarks, 
which will be called LFR and RC throughout the text. 
LFR (Lancichinetti-Fortunato-Radicchi) benchmarks are 
characterized by providing networks in which both the 
degrees of the nodes and the sizes of the communities follow
power laws [10]. RC (Relaxed Caveman) benchmarks
start with networks in which all the nodes in a community
are connected. Then, this structure is relaxed by generating
intercommunity links [11]. We further divided LFR
and RC benchmarks into "open" and "closed". Open 
benchmarks have been commonly used in the past (e.g. 
[10, 12, 13]). In them, sets of similar networks with different
proportions of intercommunity links are tested. With
many intercommunity links, the networks approach ran- 
domness. In closed benchmarks, a starting community 
structure is progressively transformed into a second, fi- 
nal structure which is exactly known. 

For each benchmark, we estimated S and Q with the
six algorithms. The maximum values of S and Q obtained
(Smax and Qmax) provided the partitions that were compared
with the known community structures. As in previous
works [ini HH [H] , Normalized Mutual Information
(NMI) was used to measure the congruence between the
known and the estimated community structures. However,
we also used the Variation of Information (VI) [16]
in a particular case.
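The two congruence measures can be sketched as follows. This is a minimal illustration, assuming NMI is normalized by the sum of the entropies, NMI = 2I(X;Y)/(H(X) + H(Y)), which is the form consistent with the relation between NMI and VI given in Methods. Partitions are passed as label lists of equal length, and the function names are hypothetical.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a partition given as a label list."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def mutual_information(x, y):
    """Mutual information between two partitions of the same units."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * log(c * n / (cx[a] * cy[b])) for (a, b), c in cxy.items())

def nmi(x, y):
    """Normalized Mutual Information, assumed here to be 2I / (H(X) + H(Y))."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 1.0  # both partitions consist of a single community
    return 2 * mutual_information(x, y) / (hx + hy)

def vi(x, y):
    """Variation of Information: H(X) + H(Y) - 2I(X;Y)."""
    return entropy(x) + entropy(y) - 2 * mutual_information(x, y)

known = [0, 0, 1, 1]
print(nmi(known, [1, 1, 0, 0]), vi(known, [1, 1, 0, 0]))  # same partition, relabeled
print(nmi(known, [0, 1, 0, 1]), vi(known, [0, 1, 0, 1]))  # independent partitions
```

NMI grows with agreement (1 for identical partitions) whereas VI is a distance (0 for identical partitions), so the two measures read in opposite directions.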



Open benchmarks 

Figures 1a and 1b summarize the results obtained for
four standard open LFR benchmarks that differ in number
of units and community sizes [10] (see Methods). Figure
1a indicates that selecting the solution with a maximum
S value leads to a perfect characterization of the
network structure (NMI_S = 1) even when that structure
is blurred by a large number of inter-community links,
generated by increasing the mixing parameter μ up to
0.5-0.7 (see Methods for the definition of μ). If μ is further
increased, the original partition is not chosen by any
algorithm (NMI_S < 1). This suggests that the original
community structure is not present anymore, which is in
good agreement with the fact that Smax > Sorig, where
Sorig is the S value obtained assuming that the original
community structure is still present (Table S1). S
maximization qualitatively improves over Q maximization
(Figure 1b and Table S1): NMI_S > NMI_Q in
2827/3600 = 78.5% of the cases, NMI_Q > NMI_S in just
4.1% of them and the rest are ties. Interestingly, NMI_Q
> NMI_S in quasi-random and random networks (Figure
1b), suggesting that maximizing Q imposes spurious
community structures in those cases. It is significant
that S maximization provided better average NMI scores
than those obtained by any single algorithm in these same
benchmarks [15]. Different algorithms provided the top
S scores, depending on the benchmark and μ value examined
(Figure 2a and Figure S1).

FIG. 1: Results for open LFR and RC benchmarks. a) Results
for the four standard LFR networks. B and S indicate
big and small communities respectively and 1000 or 5000 the
number of nodes. μ: mixing parameter. NMI measures the
congruence between the known and the deduced community
structures. Each point is based on 100 different networks;
standard errors of the mean are too small to be visualized.
Values for 100 random (R) networks with the same number
of units and degree distributions are also shown. b) Comparison
of S and Q maximizations in LFR benchmarks. The
NMI_Q/NMI_S ratios, which are almost always below 1, are
shown. c) Results for the RC benchmark. The parameter
Degradation (D) indicates the percentage of both deleted and
shuffled links. Each black dot is based on 100 networks; again,
standard errors are so small that they cannot be visualized at this
scale. For each value of D, results for 100 random networks
with the same number of links are also shown (open circles).
d) Relative quality of the partitions generated by maximizing
S and Q in RC benchmarks. As in panel b, NMI_Q/NMI_S
ratios are shown. White dots: results for random networks
with different D values.

The discovery of the resolution limit of Q showed that
heterogeneous community sizes may greatly affect the
ability of global parameters to detect structure [8]. However,
by construction, community sizes in the standard
LFR benchmarks are very similar. Pielou's evenness indexes
(PI) [21] ranged from 0.96 to 0.98 in the four benchmarks
used above, close to the maximum value of the
index (PI = 1 for communities of identical size). Considering
that it was critical to test S in more extreme
situations, we built the RC benchmarks, which have PIs
as low as 0.70 (as shown in Figure S2). Figures 1c and
1d summarize the results for open RC benchmarks, with
progressive Degradation (D; see Methods) of the original
structure. That structure is efficiently detected by S
maximization, with a slow decrease in performance when
D increases (Figure 1c; see also Table S2, Figure S2).
Again, S maximization clearly improves over Q maximization
in these benchmarks (Figure 1d; NMI_S > NMI_Q
in 848/900 = 94.2% of the cases, while NMI_Q > NMI_S in
just 3.3% of the cases). As occurred for the LFR benchmarks,
none of the algorithms obtained the best results
in all networks (Figure 2b).

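The heterogeneity of community sizes discussed above can be quantified with a few lines of code. The sketch below assumes the standard definition of Pielou's evenness, PI = H / ln(C), where H is the Shannon entropy of the community size proportions and C is the number of communities. The function name is hypothetical, and returning 1 for a single community is a convention chosen here.

```python
from math import log

def pielou_index(sizes):
    """Pielou's evenness of a list of community sizes (1 = all sizes equal)."""
    sizes = [s for s in sizes if s > 0]
    if len(sizes) < 2:
        return 1.0  # convention assumed here for a single community
    total = sum(sizes)
    h = -sum(s / total * log(s / total) for s in sizes)
    return h / log(len(sizes))

print(pielou_index([32] * 16))       # 16 equal communities of 32 units
print(pielou_index([97, 1, 1, 1]))   # one dominant community
```

Equal sizes give the maximum value of 1, while a partition dominated by one large community gives a value close to 0.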
FIG. 2: Average performance of the algorithms in the open
LFR and RC benchmarks. The algorithms used were described
by Arnau et al. [9], Aldecoa and Marín (AM) [13], Rosvall
and Bergstrom (RB) [17], Ronhovde and Nussinov (RN)
[18], Blondel et al. [19] and Duch and Arenas (DA) [20]. a)
Typical example of the results obtained in LFR benchmarks,
here with 5000 units and big communities (see Figure S1 for
all of them). After ordering the algorithms from best to worst
performance, their ranks were added for the 100 different networks.
Performance was defined as P = 6 - average rank.
Therefore, the maximum value P = 5 means that an algorithm
was the best in all networks tested, while P = 0 means
that it was always the worst. As can be observed, none
of the algorithms achieved optimal results in all cases. b)
Results obtained in the RC benchmark with different Degradation
(D) values. Performance evaluated as in panel a).



Closed benchmarks 

The results just shown indicate that using Smax to detect
community structure has obvious advantages over
maximizing Q. However, they do not allow us to evaluate
how close to optimal that criterion is, given that the potential
maximum NMIs are unknown. To overcome this limitation,
we generated closed LFR and RC benchmarks, in which
we had an a priori expectation of the maximum NMI
values. Results are shown in Figures 3 (LFR) and 4
(RC). In all cases in which Smax was used, an almost
perfectly symmetrical behavior was observed. In the
process of converting the original structure into the final
one (by increasing the Conversion parameter; see Methods),
NMI losses for the first structure are compensated
by increases for the second. The average of both NMIs is
thus approximately constant, and it has a value identical
or very close to (1 + NMI_IF)/2, where NMI_IF is obtained
by comparing the initial and final structures (Figures 3a-d;
Figures 4a-c; Figures S3, S4). This is exactly the result
expected for an optimal parameter (see theoretical details
in Methods). On the contrary, maximizing Q performs
poorly except when community sizes are
very similar or identical (Figures 3e, 4d; Figures S3, S4).
The same results were obtained using a second measure
of congruence, Variation of Information (VI) (Figures S5,
S6). Finally, in the LFR benchmarks, Smax was always
identical to or higher than Sorig (Figure 3f). However, this
does not happen for the RC benchmarks (Figure 4e).
Therefore, these algorithms sometimes fail to obtain the
highest possible S values. This fact may explain the slight
departures from NMI symmetry observed in some RC
benchmarks (blue diamonds in Figures 4b, 4c).

Real networks 

Figure 5 summarizes the Smax results for three real
networks. The first example is based on the CYC2008
database, which compiles 1604 proteins that belong to
324 protein complexes [22]. The general agreement between
communities detected using Smax and a priori defined
protein complexes is almost perfect, NMI_S = 0.91.
In Figure 5a, the 11 communities of size >20, out of the
313 detected, are detailed to show how fine-grained
the classification obtained is. On the contrary, optimizing
Q provides a very coarse classification into just 24 communities
with NMI_Q = 0.57. The largest five communities
alone almost cover the whole network (Figure 5b).
These results indicate how well S performs
when there are many small communities, a
typical situation in which Q, affected by its resolution
limit, radically fails. Figure 5c shows, as a positive control,
the results for a classical benchmark of well-known
structure, the College football network [12]. The agreement
with the expected communities is again very high
(NMI_S = 0.93). Finally, Figure 5d shows the results for
another well-known example, Zachary's karate club
network [T^l This social network supposedly contains
two communities. However, S analyses surprisingly
unearthed 19 communities, 12 of them singletons (Figure 5d).

FIG. 3: Results for closed LFR benchmarks. a) LFR benchmark
with 1000 units and big communities. For each Conversion
(C) value, NMIs comparing the Smax partition with the
initial (black dots) or final (red squares) community structures
were obtained. The symmetrical results led to NMI
averages (blue diamonds) that, with great precision, fell on a
straight line of value (1 + NMI_IF)/2. Dots are based on 100
independent analyses. b-d) LFR benchmarks with, respectively,
1000 units, small communities (b), 5000 units, big communities
(c) and 5000 units, small communities (d). Results are
very similar to those in panel a). e) Average NMI values for
partitions obtained maximizing Q are worse than those obtained
maximizing S, especially as we move towards C = 50,
where the real community structure is more difficult to establish.
This effect is exacerbated by large numbers of units
and small community sizes, due to the resolution limit of Q.
Results for C > 50 are symmetrical to the ones shown here.
See also Figure S3. f) Smax/Sorig ratio ≥ 1, i.e. either the
original structure or a different one with higher S is found.
These results are compatible with the algorithms used being
able to detect the true structure present with great accuracy.

FIG. 4: Results for closed RC benchmarks. Three networks
with different heterogeneity in community sizes (Pielou's indexes
equal to 0.70, 0.85 and 1.00 respectively) were used as
examples. a) PI = 1; b) PI = 0.85; c) PI = 0.70. Results
similar to those in Figure 3, except that the figures are
not so perfectly symmetrical in the most heterogeneous networks
(panels b and c; blue diamonds slightly deviate from
the straight line). d) Average NMI values are much worse
when Q is used, provided that community sizes are heterogeneous.
See also Figure S4. e) Smax/Sorig < 1 with heterogeneous
community sizes. The algorithms used did not detect
in those cases the maximum possible S, which still may correspond
to the initial structure. This may contribute to the
departures from symmetry shown in panels b and c. The fact that
Smax/Sorig > 1 with C < 0.50 and PI = 0.70 (blue diamonds)
implies that the algorithms are detecting structures
different from the initial one.

FIG. 5: Community structure of the CYC2008 network (a,
b), College football network (c) and Zachary's karate club
network (d), according to S maximization (panels a, c, d) or
Q maximization (panel b). In panel c, the known community
structure is shown (squares). The broken lines in panel d
divide the network into the two communities assumed to exist.
That division of the network is not supported at all by Smax
analyses. While S(2 communities) = 13.61, the optimal division
found has S(19 communities) = 25.69. Twelve of these optimal
communities are singletons (white dots).

Discussion

In this study, we have shown the potential of maximizing
the global parameter Surprise (S) to determine
the community structure present in complex networks.
The results indicate that it performs qualitatively better
than the hitherto most commonly used global
measure, Newman and Girvan's modularity (Q). The advantage
of S over Q is perhaps not that surprising, considering
the different theoretical foundations of both measures.
Newman and Girvan's Q is based on a simple definition
of community, as a region of the network with an
unexpectedly high density of links. However, the number
of units within each community does not influence
the value of Q [7]. In contrast, S evaluates both the
number of links and of units in each community (see formula (1)).
Therefore, S implicitly assumes a more complex definition
of community: a precise number of units for which
a density of links is found that is statistically unexpected
given the features of the network. In this context
of comparison of both measures, it is also very significant
that, while some of the algorithms used in this work were
the best among those specifically designed to maximize
Q, none was devised to maximize S. Therefore, our results
actually underestimate the power of S maximization for
community detection. A direct example of that underestimation
is shown in Figure 4e: the maximum values
of S were, in some cases, not found. The few exceptions
found in which NMI_Q > NMI_S (3-4% of all the cases examined
in the open benchmarks) could also be explained
by an incomplete success in determining Smax with these
algorithms.

The commonly used open benchmarks are useful for
general evaluations of the performance of different algorithms,
but they do not allow establishing how close to
optimal the results obtained are. For that purpose, we have devised
novel closed benchmarks in which an initial known
community structure is progressively transformed into
a second, also known, community structure. Provided
that both community structures are identical, it can be
demonstrated that, at any point of the transformation
from one to the other, the average of the NMIs of the
solution found with respect to the initial and final structures
should approximate a constant value, (1 + NMI_IF)/2, if
that solution is optimal (see Methods). This feature allows
establishing the intrinsic quality of the partitions
obtained, with S maximization often providing optimal
results. We conclude that S maximization establishes the
community structure of complex networks with high accuracy.
Two promising lines of research are clear. First,
generating novel, specific algorithms for S maximization, 
which may improve over the existing ones. Second, build- 
ing a standard set of closed benchmarks to test any new 
algorithms for community detection. Our LFR and RC 
closed benchmarks may be a good starting point for that 
standard set. 

When S maximization was applied to real networks,
the results obtained were of two types. On one hand, for
the CYC2008 and College football networks, the expectation
was to find a clear community structure which
should faithfully correspond to either the complexes of
which the proteins examined are part (CYC2008 network)
or to the conferences to which the teams belong
(College football network), given that intracomplex or intraconference
links are abundant (e.g. Figure 5). These
are exactly the results found using Smax. On the other
hand, the structure of Zachary's karate network is
far from obvious (Figure 5d). Therefore, finding that, according
to Smax, the network contains some small groups
plus many singletons is, at least a posteriori, not so unexpected.
A natural question is then why the scientific
community has been so keen on exploring this particular
network, often to establish whether an algorithm was
able or not to detect the putative two communities (e. 
g. refs. [3 [12 HSl m] among many others). This may 
reflect a psychological bias, to which the use of underperforming
methods for community detection may well have
contributed. It shows to what extent human
prejudices may taint evaluations in this type of ill-defined
problem.



Methods 

Algorithms used to maximize S and Q 

Six of the best available algorithms, selected either by 
their exceptional performance in artificial benchmarks 
or their success in previous analyses of real and simu- 
lated networks 9, 13 15, 25,i2S]j were used. They were 
the following: 1) UVCluster algorithm [9, 13]: it performs
iterative hierarchical clustering, generating dendrograms.
The best values of S and Q were obtained by
scanning these dendrograms from root to leaves. 2)
SCluster algorithm [13]: also performs iterative hierarchical
clustering, but using an alternative strategy which is
faster and sometimes more accurate than the one implemented
in UVCluster. 3) Dynamic algorithm by Rosvall
and Bergstrom [17]: an algorithm based on expressing
the characterization of communities as an information
compression problem. 4) Potts model multiresolution algorithm
[18]: works by minimizing the Hamiltonian of
a Potts spin model at different resolution scales, i.e.
searching for communities of different sizes. 5) Fast modularity
optimization [19]: devised to maximize Q. It provides
multiple solutions from which values for S and Q
can be obtained, and the maximum ones were used in our
analyses. 6) Extremal optimization algorithm [20]: a divisive
algorithm also developed to maximize Q. Analyses
were always performed with the default program settings.



Features of the benchmarks 

First, the recently developed LFR benchmarks, specifically
devised for testing alternative community detection
strategies [10], were used. In particular, we chose
four standard LFR benchmarks already explored by
other authors [15]. The networks analyzed had either
1000 or 5000 units and were built according to two alternative
ranges of community sizes (Big (B): 20-100
units/community; Small (S): 10-50 units/community).
For each of the four conditions (1000 B, 1000 S, 5000 B,
5000 S), 100 different networks were generated for each
value of a mixing parameter μ, which varied from 0.1 to
0.9 [T3]. μ is the average fraction of a unit's links that connect
it to units in other communities. Logically, increasing
μ weakens the network community structure. When
μ = 0.9, the networks are quasi-random (see below).
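To make the definition of μ concrete, the sketch below measures the empirical mixing of an existing network, i.e. the average over units of the fraction of each unit's links that lead to other communities. In the LFR generator μ is an input parameter, so this function (a hypothetical helper, not part of the LFR software) only checks the value realized in a given graph. Units without links are ignored, which is an assumption made here.

```python
def mixing_parameter(edges, membership):
    """Average fraction of each unit's links that reach other communities."""
    degree, external = {}, {}
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            degree[a] = degree.get(a, 0) + 1
            if membership[a] != membership[b]:
                external[a] = external.get(a, 0) + 1
    fractions = [external.get(node, 0) / d for node, d in degree.items()]
    return sum(fractions) / len(fractions)

# Two triangles joined by one link: only units 2 and 3 have an external link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(mixing_parameter(edges, membership))
```

In this toy example two of the six units have one external link out of three, so the empirical mixing is (1/3 + 1/3)/6 = 1/9.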

Having found that these LFR benchmarks generated networks
with communities of very similar sizes, we decided
to implement RC benchmarks in which these sizes were
more variable. All networks in these benchmarks had
512 units divided into 16 communities. One hundred networks
with random community sizes, determined using a
broken-stick model [37], were generated. This model provides
highly heterogeneous community sizes. Progressive
weakening of the community structure of the RC networks,
similar to the effect of increasing μ in the LFR
networks, was obtained as follows. Initially, all units
of each community in the network were fully connected.
Then, that obvious structure was progressively blurred,
by first randomly removing a certain percentage of edges
and then randomly shuffling the same percentage of links
among the units. We have called that common percentage
Degradation (D). Thus, D = 10% means that, first,
10% of the links present were eliminated and then 10% of
the remaining edges were randomly shuffled among units.
Shuffling involved first the random removal of an edge of
the graph and then the addition of a new edge between
two randomly chosen nodes.
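The construction just described can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the broken-stick step is assumed to cut the ordered list of units at distinct random positions, Degradation is passed as a fraction (0.10 for D = 10%), and shuffling is assumed to avoid self-loops and duplicate links. Function names are hypothetical.

```python
import random
from itertools import combinations

def broken_stick_sizes(n_units, n_communities, rng):
    """Random community sizes from cutting a 'stick' of n_units at random points."""
    cuts = sorted(rng.sample(range(1, n_units), n_communities - 1))
    bounds = [0] + cuts + [n_units]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def relaxed_caveman(sizes, degradation, rng):
    """Fully connected communities, degraded by removal and then shuffling."""
    membership, edges, start = {}, set(), 0
    for community, size in enumerate(sizes):
        nodes = range(start, start + size)
        membership.update((v, community) for v in nodes)
        edges.update(combinations(nodes, 2))  # stored as (u, v) with u < v
        start += size
    n_nodes = start
    # Step 1: remove a fraction `degradation` of the links.
    for edge in rng.sample(sorted(edges), round(len(edges) * degradation)):
        edges.remove(edge)
    # Step 2: shuffle the same fraction of the remaining links.
    edge_list = sorted(edges)
    for _ in range(round(len(edge_list) * degradation)):
        i = rng.randrange(len(edge_list))
        while True:
            u, v = rng.sample(range(n_nodes), 2)
            new = (min(u, v), max(u, v))
            if new not in edges:  # assumption: no duplicate links
                break
        edges.remove(edge_list[i])
        edges.add(new)
        edge_list[i] = new
    return edges, membership

rng = random.Random(1)
sizes = broken_stick_sizes(512, 16, rng)
edges, membership = relaxed_caveman(sizes, 0.10, rng)
print(sizes, len(edges))
```

Shuffling leaves the number of links unchanged, so the final link count depends only on the removal step.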

In the LFR and RC benchmarks just described it was
possible to compare networks having obvious community
structures (generated with low μ or D parameters)
with others that were increasingly random. We have
called this type of benchmark open. We also generated
closed LFR and RC benchmarks. In them, links were
shifted in a directed way, in order to convert the original
community structure of a network into a second, also
predefined, structure. In this way, it is possible to monitor
when the original structure is substituted by the final
one according to the solutions provided by Smax or
Qmax. In the LFR and RC closed benchmarks, the starting
networks were the same described in the previous
paragraphs, with μ = 0.1 (LFR) or D = 0 (RC) respectively,
and the final networks were obtained by randomly
relabeling the nodes. Therefore, the initial and final networks
had identical community structures but the nodes
within each community were different. Conversion (C)
is defined as the percentage of links exclusively present
in the initial network that are substituted by links only
present in the final one (i.e. C = 0: initial structure
present; C = 100: final structure present).
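A closed benchmark of this kind can be sketched in a few lines. The code below is an illustration only: which links are substituted at each Conversion value is assumed to be chosen uniformly at random, links are stored as ordered pairs (u, v) with u < v, and the function names are hypothetical.

```python
import random

def relabel(edges, n_nodes, rng):
    """Final network: the initial one with its nodes randomly relabeled."""
    perm = list(range(n_nodes))
    rng.shuffle(perm)
    return {(min(perm[u], perm[v]), max(perm[u], perm[v])) for u, v in edges}

def convert(initial, final, conversion, rng):
    """Network at a given Conversion (0-100) between initial and final."""
    only_initial = sorted(initial - final)
    only_final = sorted(final - initial)  # same length: relabeling keeps the link count
    n_swap = round(len(only_initial) * conversion / 100)
    removed = rng.sample(only_initial, n_swap)
    added = rng.sample(only_final, n_swap)
    return (initial - set(removed)) | set(added)

rng = random.Random(7)
# Initial structure: two fully connected communities of four units each.
initial = {(u, v) for block in (range(0, 4), range(4, 8))
           for u in block for v in block if u < v}
final = relabel(initial, 8, rng)
halfway = convert(initial, final, 50, rng)
print(len(initial), len(final), len(halfway))
```

Links shared by the initial and final networks are never touched, so C = 0 returns the initial network, C = 100 returns the final one, and the number of links is constant throughout.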



NMI symmetry as a measure of performance in 
closed benchmarks 



In our closed benchmarks, a peculiar symmetrical behavior
of NMI values with respect to the initial and final partitions
is expected. Imagine that a putative optimal partition
is estimated according to a given criterion. Let us
now consider the following triangle inequality:

\[ \mathrm{NMI}_{IE} + \mathrm{NMI}_{EF} \le 1 + \mathrm{NMI}_{IF} \tag{2} \]

where NMI_IE is the normalized mutual information
calculated for the initial structure (I) and the estimated
partition (E), NMI_EF is the normalized mutual information
for the final structure (F) versus the estimated partition
and NMI_IF is the normalized mutual information for
the comparison between the initial and final structures.
Inequality (2) holds true if the structures of I, F and E are
identical (i.e. both the number and sizes of the communities
are the same, although the nodes within each
community need not be the same). This follows from the
fact that



\[ 1 - \mathrm{NMI}_{XY} = \frac{\mathrm{VI}_{XY}}{H(X) + H(Y)} \tag{3} \]

where VI_XY is the Variation of Information for both
partitions [16] and H(X) and H(Y) are the entropies of
the X and Y partitions, respectively. Given that VI is a
metric [16], it satisfies the triangle inequality



\[ \mathrm{VI}_{AB} + \mathrm{VI}_{BC} \ge \mathrm{VI}_{AC} \tag{4} \]



If, as indicated, the structures of all partitions are identical,
then all their entropies are also identical. In that
case, the following inequality can be deduced from formulae
(3) and (4):

\[ (1 - \mathrm{NMI}_{AB}) + (1 - \mathrm{NMI}_{BC}) \ge (1 - \mathrm{NMI}_{AC}) \tag{5} \]

From this inequality, and substituting A, B and C with
I, E and F, respectively, formula (2) can be deduced. Formula
(2) therefore means that, provided that I, E and
F have the same structure, the average of NMI_IE and
NMI_EF can reach a maximum value of (1 + NMI_IF)/2.
Inequality (2) will also hold approximately true if the entropies
of I, E and F are very similar (i.e. many identical
communities). In our closed benchmarks the I and F
structures are identical, and we progressively convert one
into the other. It is thus expected that the optimal partition
along this conversion is similar in structure to both I
and F. Hence, deviations from the expected average value
(1 + NMI_IF)/2 are a cause for concern, as they probably
mean that the optimal partition has not been found. On
the other hand, finding values equal to (1 + NMI_IF)/2 is
a strong indication that the optimal partition has indeed
been found.

It is worth noting that, although NMI has been com- 
monly used in this field (TUl HH [H] , using VI instead has 



clear advantages for analyzing closed benchmarks: Formula
(4) can be used instead of Formula (2), avoiding any consideration
of entropies. This is why we evaluated the closed
benchmark results using both NMI and VI (see above).



Real networks 

Two of the three networks explored, known as Col- 
lege football and Zachary's karate networks, have been 
frequently used in the past in the context of community 
detection [e. g. refs. [1 [H HH]- The third 

network was derived from the CYC2008 protein complexes
database [22]. This database contains information for 408
protein complexes of the yeast Saccharomyces cerevisiae.
The protein complex data were converted into 324 non-overlapping
complexes by assigning each protein present
in multiple complexes to the largest one. This was done
to allow for NMI calculations. Once each protein (unit)
was assigned to a non-overlapping cluster (community),
we downloaded from the BioGRID database [29] the
protein-protein interactions (edges) characterized so far
for all these proteins. The final graph contained 1604
nodes and 14171 edges.
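The conversion of overlapping complexes into non-overlapping ones can be sketched as below. The input format (a dictionary from complex name to its set of proteins) and the function name are hypothetical, and breaking size ties alphabetically is an assumption, since the text does not say how ties were resolved.

```python
def assign_to_largest(complexes):
    """Map each protein to the largest complex that contains it."""
    # Rank complexes by original size, largest first; ties broken by name (assumption).
    ranked = sorted(complexes, key=lambda name: (-len(complexes[name]), name))
    assignment = {}
    for name in ranked:
        for protein in complexes[name]:
            assignment.setdefault(protein, name)  # keep the first (largest) complex
    return assignment

complexes = {
    "A": {"p1", "p2", "p3"},
    "B": {"p3", "p4"},
    "C": {"p4"},
}
assignment = assign_to_largest(complexes)
print(assignment)
print(len(set(assignment.values())), "non-overlapping complexes remain")
```

A complex whose proteins are all claimed by larger complexes disappears, as complex "C" does here, which is consistent with the number of complexes decreasing after the conversion.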